Extracting Baseline Electricity Usage with Gradient Tree Boosting

نویسندگان

  • Taehoon Kim
  • Dongeun Lee
  • Jaesik Choi
  • Anna Spurlock
  • Alex Sim
  • Annika Todd
  • Kesheng Wu
چکیده

To understand how specific interventions affect a process observed over time, we need to control for the other factors that influence outcomes. Such a model that captures all factors other than the one of interest is generally known as a baseline. In our study of how different pricing schemes affect residential electricity consumption, the baseline would need to capture the impact of outdoor temperature along with many other factors. In this work, we examine a number of different data mining techniques and demonstrate Gradient Tree Boosting (GTB) to be an effective method to build the baseline. We train GTB on data prior to the introduction of new pricing schemes, and apply the known temperature following the introduction of new pricing schemes to predict electricity usage with the expected temperature correction. Our experiments and analyses show that the baseline models generated by GTB capture the core characteristics over the two years with the new pricing schemes. In contrast to the majority of regression based techniques which fail to capture the lag between the peak of daily temperature and the peak of electricity usage, the GTB generated baselines are able to correctly capture the delay between the temperature peak and the electricity peak. Furthermore, subtracting this temperature-adjusted baseline from the observed electricity usage, we find that the resulting values are more amenable to interpretation, which demonstrates that the temperature-adjusted baseline is indeed effective.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Multi-class Image Classification Based on Fast Stochastic Gradient Boosting

Nowadays, image classification is one of the hottest and most difficult research domains. It involves two aspects of problem. One is image feature representation and coding, the other is the usage of classifier. For better accuracy and running efficiency of high dimension characteristics circumstance in image classification, this paper proposes a novel framework for multi-class image classifica...

متن کامل

Power System Parameters Forecasting Using Hilbert-Huang Transform and Machine Learning

A novel hybrid data-driven approach is developed for forecasting power system parameters with the goal of increasing the efficiency of short-term forecasting studies for non-stationary time-series. The proposed approach is based on mode decomposition and a feature analysis of initial retrospective data using the Hilbert-Huang transform and machine learning algorithms. The random forests and gra...

متن کامل

Gradient Tree Boosting for Training Conditional Random Fields

Conditional random fields (CRFs) provide a flexible and powerful model for sequence labeling problems. However, existing learning algorithms are slow, particularly in problems with large numbers of potential input features and feature combinations. This paper describes a new algorithm for training CRFs via gradient tree boosting. In tree boosting, the CRF potential functions are represented as ...

متن کامل

GEFCOM 2014 - Probabilistic Electricity Price Forecasting

Energy price forecasting is a relevant yet hard task in the field of multi-step time series forecasting. In this paper we compare a wellknown and established method, ARMA with exogenous variables with a relatively new technique Gradient Boosting Regression. The method was tested on data from Global Energy Forecasting Competition 2014 with a year long rolling window forecast. The results from th...

متن کامل

Demand Prediction of Bicycle Sharing Systems

Bike sharing system requires prediction of bike usage based on usage history to re-distribute bikes between stations. In this study, original data was collected around Washington D.C. area in 2011 and 2012. Original data is processed by several feature engineering approaches based on analysis and understanding of the data. Ridge linear regression, support vector regression (εSVR), random forest...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2016